Speed up telemetry check by sharing a single TS program across roots !! by shahzad31 · Pull Request #267651 · elastic/kibana

shahzad31 · 2026-05-05T07:21:36Z

Summary

The CI telemetry check (node scripts/telemetry_check) was creating a separate TypeScript program (ts.createProgram + getTypeChecker()) for each of the 10 telemetry roots. Since each program independently resolves the same shared Kibana transitive dependencies, this resulted in ~212s of redundant sequential CPU work.

This PR:

- Creates a single shared TS program for all 69 collector files across all roots, then partitions the parsed results back by root
- Parallelizes globbing across all roots via Promise.all
- Extracts filterCollectorPaths and extractCollectorsWithProgram as reusable functions
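
The "parse once, partition by root" shape described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `TelemetryRoot`, `ParsedCollector`, `ParseAll`, and `parseOnceAndPartition` are hypothetical names, and `parseAll` stands in for the expensive step (ts.createProgram + getTypeChecker() plus collector extraction).

```typescript
// All names here are illustrative; the real kbn-telemetry-tools types are not shown on this page.
interface TelemetryRoot {
  root: string;
  collectorPaths: string[];
}

interface ParsedCollector {
  filePath: string;
  collectorName: string;
}

// Stand-in for the expensive step (ts.createProgram + getTypeChecker in the PR).
type ParseAll = (allPaths: string[]) => ParsedCollector[];

function parseOnceAndPartition(
  roots: TelemetryRoot[],
  parseAll: ParseAll
): Map<string, ParsedCollector[]> {
  // 1. Collect every collector path across all roots, de-duplicated.
  const seen = new Set<string>();
  const allPaths: string[] = [];
  for (const { collectorPaths } of roots) {
    for (const p of collectorPaths) {
      if (!seen.has(p)) {
        seen.add(p);
        allPaths.push(p);
      }
    }
  }

  // 2. Run the expensive parse exactly once for the whole set.
  const parsed = parseAll(allPaths);

  // 3. Index the results by file so each root can pick out its own.
  const byFile = new Map<string, ParsedCollector[]>();
  for (const collector of parsed) {
    const bucket = byFile.get(collector.filePath) ?? [];
    bucket.push(collector);
    byFile.set(collector.filePath, bucket);
  }

  // 4. Partition back by root, preserving each root's path order.
  const byRoot = new Map<string, ParsedCollector[]>();
  for (const { root, collectorPaths } of roots) {
    const collectors: ParsedCollector[] = [];
    for (const p of collectorPaths) {
      collectors.push(...(byFile.get(p) ?? []));
    }
    byRoot.set(root, collectors);
  }
  return byRoot;
}
```

The cost of step 2 is paid once regardless of how many roots there are, which is the property the benchmark below relies on; steps 3 and 4 are cheap map lookups.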

Benchmark results (local, 3 consistent runs)

| Metric | Before | After | Improvement |
|--------|--------|-------|-------------|
| Total wall time | 210s | 59s | -151s (72%) |
| TS type-check time | 212s (6 programs) | 50s (1 program) | -162s |
| Glob phase | 0.7s (sequential) | 0.3s (parallel) | -0.4s |

Validation

- node scripts/telemetry_check: passes (no changes mode)
- node scripts/telemetry_check --fix: passes, correctly detects and fixes schema drift
- schema_checks.test.ts: all 13 tests pass
- kbn-telemetry-tools unit tests: all 8 suites / 45 tests pass
- Tested with a real collector change (added field to CSP collector) to verify end-to-end detection, fix, and schema JSON update

Test plan

- [ ] CI telemetry check passes on this PR
- [ ] Verify telemetry check correctly detects schema drift when a collector is modified
- [ ] Verify --fix correctly updates the JSON schema files
- [ ] Verify --path flag still works for scoped checks

Made with Cursor

Previously each telemetry root (10 total) created its own ts.createProgram + getTypeChecker(), redundantly resolving the same shared Kibana transitive dependencies. This accounted for ~212s of sequential CPU time.

This change collects all collector file paths across all roots upfront, creates a single shared TypeScript program, then partitions the parsed results back by root. Globs also now run in parallel via Promise.all.

Local benchmark: 210s → 59s (~72% reduction).

Co-authored-by: Cursor <cursoragent@cursor.com>
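
As a rough illustration of the parallel glob phase mentioned in the commit message, the sketch below starts one glob per root and awaits them together. `GlobFn`, `RootFiles`, `globAllRoots`, and `globRootsSequentially` are hypothetical names; the actual glob helper used by the script is not shown on this page, so it is passed in as a parameter.

```typescript
// `GlobFn` is a hypothetical stand-in for whatever async glob helper the script uses.
type GlobFn = (root: string) => Promise<string[]>;

interface RootFiles {
  root: string;
  files: string[];
}

// Start one glob per root immediately, then wait for all of them together.
// Promise.all preserves input order, so results line up with `roots`.
function globAllRoots(roots: string[], glob: GlobFn): Promise<RootFiles[]> {
  return Promise.all(
    roots.map(async (root) => ({ root, files: await glob(root) }))
  );
}

// For contrast: the sequential shape, where each glob waits for the previous one.
async function globRootsSequentially(
  roots: string[],
  glob: GlobFn
): Promise<RootFiles[]> {
  const results: RootFiles[] = [];
  for (const root of roots) {
    results.push({ root, files: await glob(root) });
  }
  return results;
}
```

Because globbing is I/O-bound, the concurrent version lets the file system work for several roots overlap, whereas the sequential version pays for each root in turn.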

kibanamachine · 2026-05-05T09:04:14Z

💛 Build succeeded, but was flaky

Buildkite Build
Commit: 5c81992

Failed CI Steps

FTR Configs #118

Test Failures

[job] [logs] FTR Configs #118 / Agent Builder agents Edit agent should edit agent name

Metrics [docs]

✅ unchanged

History

💔 Build #437886 failed 7890f8b
💔 Build #437877 failed b82c06e

infra-vault-gh-plugin-prod · 2026-05-05T11:10:16Z

Pinging @elastic/actionable-obs-team (Team:actionable-obs)

macroscopeapp · 2026-05-05T11:13:42Z

Approvability

Verdict: Needs human review

Performance optimization refactor for internal telemetry tooling. All changed files are owned by @elastic/kibana-core and the author is not a designated owner, so designated code owners should review.

afharo

Thank you for the speed boost. I added two thoughts on potential additional improvements.

Happy to approve if you prefer to address those in a follow-up PR.

afharo · 2026-05-05T23:10:59Z

-        const restrictedProgramPaths = programPaths.filter((programPath) =>
-          fullRestrictedPaths.includes(programPath)
+  return [
+    {


This is awesome! I wonder if we can make it even faster by defining concurrent tasks:

1. One task that gets the program
2. A set of parallel tasks that run extractCollectorsWithProgram with the shared program created in step 1.

shahzad31 · May 6, 2026

Thanks for the suggestion! I benchmarked this: extractCollectorsWithProgram across all 10 roots takes 0.06s total (most roots have 0-5 collectors; the largest two have 26 and 30). Since it's purely CPU-bound synchronous work (AST traversal + type checker lookups), Promise.all in single-threaded Node.js wouldn't actually parallelize it; the synchronous generators would simply run one after another. True parallelism would need worker_threads, but the ts.Program can't be serialized across threads.

The bottleneck is createKibanaProgram at 53s (95.9% of the task). Extraction is negligible by comparison, so I'll leave this as-is.
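
The point about Promise.all and synchronous work can be seen in a small sketch. `runExtractionsWithPromiseAll` and its inner `extract` are hypothetical stand-ins for extractCollectorsWithProgram, with a busy loop in place of the AST traversal; nothing here is taken from the PR's code.

```typescript
// Hypothetical stand-in for extractCollectorsWithProgram: purely synchronous, CPU-bound work.
function runExtractionsWithPromiseAll(roots: string[]): string[] {
  const events: string[] = [];

  const extract = async (root: string): Promise<number> => {
    events.push(`start:${root}`);
    let checksum = 0;
    for (let i = 0; i < 100000; i++) {
      checksum = (checksum + i) % 9973; // busy loop standing in for AST traversal
    }
    events.push(`end:${root}`);
    return checksum;
  };

  // Each async body contains no `await`, so it runs to completion
  // synchronously inside map(), one root after another.
  void Promise.all(roots.map(extract));

  // Snapshot taken before any promise callback could have run.
  return events;
}
```

Calling `runExtractionsWithPromiseAll(['a', 'b'])` returns `['start:a', 'end:a', 'start:b', 'end:b']`: the work for `a` finishes before `b` even starts, so wrapping synchronous extraction in Promise.all changes nothing about the order or the total CPU time.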

afharo · 2026-05-05T23:12:04Z

 }

+export function filterCollectorPaths(fullPaths: string[]): string[] {
+  return fullPaths.filter((p) => COLLECTOR_RE.test(readFileSync(p, 'utf-8')));


I think there's potential to make this function async, which could cut more time.

shahzad31 · May 6, 2026

Good call: filterCollectorPaths currently uses readFileSync on all 36,308 globbed files and takes 1.79s. Making it async with fs.promises.readFile + Promise.all would overlap the I/O and could cut that roughly in half.

That said, it's ~3% of the total task time (53s is createKibanaProgram), so the absolute savings would be ~1s. Happy to do it in a follow-up for cleanliness!
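
One possible shape for the async follow-up discussed here is sketched below. It is an assumption-laden illustration, not the planned implementation: `COLLECTOR_RE_SKETCH` is a made-up pattern (the real COLLECTOR_RE is not shown on this page), the file reader is injected as `readText` so the sketch needs no imports, and the `batchSize` cap is an added precaution against opening tens of thousands of files at once.

```typescript
// Hypothetical pattern; the real COLLECTOR_RE is not shown on this page.
const COLLECTOR_RE_SKETCH = /makeUsageCollector/;

// Injected so the sketch has no imports; in Node this could be
// (p) => fs.promises.readFile(p, 'utf-8').
type ReadText = (path: string) => Promise<string>;

async function filterCollectorPathsAsync(
  fullPaths: string[],
  readText: ReadText,
  batchSize = 256 // assumed cap on concurrent reads to avoid exhausting file descriptors
): Promise<string[]> {
  const matches: string[] = [];
  for (let i = 0; i < fullPaths.length; i += batchSize) {
    const batch = fullPaths.slice(i, i + batchSize);
    // Reads within a batch overlap; batches run one after another.
    const flags = await Promise.all(
      batch.map(async (p) => COLLECTOR_RE_SKETCH.test(await readText(p)))
    );
    batch.forEach((p, j) => {
      if (flags[j]) matches.push(p);
    });
  }
  return matches;
}
```

The output keeps the input order, matching what the synchronous filter would return. Whether the overlap actually halves the 1.79s would need to be measured.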

kibanamachine · 2026-05-06T15:59:12Z

Starting backport for target branches: 8.19, 9.3, 9.4

https://github.com/elastic/kibana/actions/runs/25446329284

kibanamachine · 2026-05-06T16:09:12Z

💔 Some backports could not be created

| Status | Branch | Result |
|--------|--------|--------|
| ❌ | 8.19 | Backport failed because of merge conflicts |
| ✅ | 9.3 | |
| ✅ | 9.4 | |

Note: Successful backport PRs will be merged automatically after passing CI.

Manual backport

To create the backport manually run:

node scripts/backport --pr 267651

Questions ?

Please refer to the Backport tool documentation


github-actions bot added the author:actionable-obs label (PRs authored by the actionable obs team) May 5, 2026

kibanamachine added 2 commits May 5, 2026 07:34

Changes from node scripts/lint.js --fix

7890f8b

Changes from node scripts/eslint_all_files --no-cache --fix

5c81992

shahzad31 marked this pull request as ready for review May 5, 2026 11:10

shahzad31 requested a review from a team as a code owner May 5, 2026 11:10

afharo reviewed May 5, 2026

shahzad31 requested a review from afharo May 6, 2026 15:25

shahzad31 added the backport:all-open label (Backport to all branches that could still receive a release) and removed the backport:skip label (This PR does not require backporting) May 6, 2026

afharo approved these changes May 6, 2026

shahzad31 merged commit 010a027 into elastic:main May 6, 2026
87 checks passed

shahzad31 deleted the speed-up-telemetry-check-shared-program branch May 6, 2026 15:58

kibanamachine added the v9.5.0 label May 6, 2026

This was referenced May 6, 2026

[9.3] Speed up telemetry check by sharing a single TS program across roots !! (#267651) #268008

Merged

[9.4] Speed up telemetry check by sharing a single TS program across roots !! (#267651) #268009

Merged

This was referenced May 6, 2026

[Scout] Enable lanes test distribution strategy for pull-request & on-merge #264506

Merged

Add accessible name to model selection popover (WCAG 4.1.2) #267622

Merged

kibanamachine added the v9.4.0 label May 6, 2026

kibanamachine added the v9.3.5 label May 6, 2026

shahzad31 merged 3 commits into elastic:main from shahzad31:speed-up-telemetry-check-shared-program
